Improve shell capturing in init containers#746

Merged
maltesander merged 11 commits into main from
fix/hdfs-container-scrtipt-logging
Feb 4, 2026

Conversation

@maltesander
Member

@maltesander maltesander commented Jan 29, 2026

Description

We see integration test failures in products that use HDFS, where formatting the image fails for reasons that are not yet clear. The NameNode reports errors such as:

Failed to start namenode.\njava.io.FileNotFoundException: No valid image files found

and

Failed to start namenode.\norg.apache.hadoop.hdfs.qjournal.client.QuorumException: Could not format one or more JournalNodes. 1 exceptions thrown:\n10.42.11.24:8485: End of File Exception between local host is: \"hdfs-namenode-default-0/10.42.11.25\"; destination host is: \"hdfs-journalnode-default-0.hdfs-journalnode-default.kuttl-test-calm-hookworm.svc.cluster.local\":8485; : java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException

Previously, the shell output of the init containers was not captured properly. This PR fixes that, so the output is available when investigating these failures.
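
To illustrate the general idea of capturing shell output, here is a minimal sketch in Python. It is not the operator's actual implementation and does not reflect the changes in this PR; the helper name `run_and_capture` and the example script are hypothetical. It only shows a script's stdout and stderr being merged into one captured stream, so error messages are not lost when a step fails.

```python
import subprocess
import sys


def run_and_capture(script: str) -> tuple[int, str]:
    """Run a shell script and return (exit code, combined stdout and stderr).

    Hypothetical helper for illustration only.
    """
    result = subprocess.run(
        ["sh", "-c", script],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured stdout stream
        text=True,
        check=False,  # do not raise on failure; the caller inspects the exit code
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Stand-in for an init container script that fails while formatting.
    code, output = run_and_capture(
        "echo 'formatting'; echo 'format failed' >&2; exit 3"
    )
    sys.stdout.write(output)
    print(f"exit code: {code}")
```

Merging both streams keeps error output next to the regular output that preceded it, which tends to make intermittent failures like the ones above easier to diagnose.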

Definition of Done Checklist

  • Not all of these items apply to every PR; the author should update this template to leave in only the boxes that are relevant
  • Please make sure all these things are done and tick the boxes

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Helm chart can be installed and deployed operator works
  • Integration tests passed (for non trivial changes)
  • Changes need to be "offline" compatible
  • Links to generated (nightly) docs added
  • Release note snippet added

Reviewer

  • Code contains useful comments
  • Changelog updated

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added
  • Links to generated (nightly) docs added
  • Release note snippet added
  • Add type/deprecation label & add to the deprecation schedule
  • Add type/experimental label & add to the experimental features tracker

@maltesander maltesander self-assigned this Jan 29, 2026
@maltesander maltesander moved this to Development: Waiting for Review in Stackable Engineering Feb 2, 2026
@sbernauer sbernauer moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Feb 2, 2026
Member

@siegfriedweber siegfriedweber left a comment

The changes look good to me. However, the HDFS pods often restart and the logging integration test sometimes fails because the cluster does not become ready within 10 minutes.

Member

@lfrancke lfrancke left a comment

I looked at it and it looks okay. I have not tested it though.

@maltesander maltesander added this pull request to the merge queue Feb 4, 2026
@maltesander maltesander moved this from Development: In Review to Development: Done in Stackable Engineering Feb 4, 2026
Merged via the queue into main with commit 8a6ab18 Feb 4, 2026
12 checks passed
@maltesander maltesander deleted the fix/hdfs-container-scrtipt-logging branch February 4, 2026 13:06